Interaction-Based Learning for High-Dimensional Data with Continuous Predictors
High-dimensional data, such as that relating to gene expression in microarray experiments, may contain a substantial amount of useful information to be explored. However, the relevant variables and their joint interactions are usually diluted by noise from a large number of non-informative variables. Consequently, variable selection plays a pivotal role in learning for high-dimensional problems. Most traditional feature selection methods, such as Pearson's correlation between response and predictors, stepwise linear regression, and LASSO, are linear. These methods are effective in identifying linear marginal effects but are limited in detecting non-linear or higher-order interaction effects. It is well known that epistasis (gene-gene interactions) may play an important role in gene expression, where unknown functional forms are difficult to identify. In this thesis, we propose a novel nonparametric measure to first screen and select features based on information from nearest neighborhoods. The method is inspired by Lo and Zheng's earlier work (2002) on detecting interactions for discrete predictors. We apply a backward elimination algorithm based on this measure, which leads to the identification of many influential clusters of variables. Those identified groups of variables can capture both marginal and interactive effects. Second, each identified cluster has the potential to perform predictions and classifications more accurately. We also study procedures for combining these individual classifiers into a final predictor. Simulation and real data analysis show that the proposed measure is capable of identifying important variable sets and patterns, including higher-order interaction sets. The proposed procedure outperforms existing methods on three different microarray datasets. Moreover, the nonparametric measure is quite flexible and can be easily extended and applied to other areas of high-dimensional data and studies.
New insights into old methods for identifying causal rare variants
The advance of high-throughput next-generation sequencing technology makes possible the analysis of rare variants. However, the investigation of rare variants in data sets of unrelated individuals faces the challenge of low power, and most methods circumvent the difficulty by using various collapsing procedures based on genes, pathways, or gene clusters. We suggest a new way to identify causal rare variants using the F-statistic and sliced inverse regression. The procedure is tested on the data set provided by the Genetic Analysis Workshop 17 (GAW17). After preliminary data reduction, we ranked markers according to their F-statistic values. Top-ranked markers were then subjected to sliced inverse regression, and those with higher absolute coefficients in the most significant sliced inverse regression direction were selected. The procedure yields good false discovery rates for the GAW17 data and is thus a promising method for future study on rare variants.
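The marker-ranking step described above can be illustrated with a minimal sketch. It assumes a quantitative trait and a one-way ANOVA F-statistic computed across genotype groups (minor-allele counts); the abstract does not specify the exact form of the statistic, and the function names `f_statistic` and `rank_markers` are hypothetical. The sliced inverse regression step is not shown.

```python
from collections import defaultdict


def f_statistic(genotypes, trait):
    """One-way ANOVA F-statistic of a quantitative trait across genotype groups.

    Assumption: genotypes are coded as minor-allele counts (0, 1, 2) and the
    trait is quantitative. This is an illustrative choice, not the paper's
    documented definition.
    """
    groups = defaultdict(list)
    for g, y in zip(genotypes, trait):
        groups[g].append(y)

    n = len(trait)
    k = len(groups)
    if k < 2 or n <= k:
        return 0.0  # monomorphic marker or too few samples to test

    grand_mean = sum(trait) / n
    means = {g: sum(v) / len(v) for g, v in groups.items()}

    # Between-group and within-group sums of squares.
    ss_between = sum(len(v) * (means[g] - grand_mean) ** 2 for g, v in groups.items())
    ss_within = sum((y - means[g]) ** 2 for g, v in groups.items() for y in v)

    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_markers(marker_matrix, trait, top=None):
    """Return marker names sorted by decreasing F-statistic.

    marker_matrix maps a (hypothetical) marker name to its list of genotypes.
    """
    scores = {name: f_statistic(g, trait) for name, g in marker_matrix.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked if top is None else ranked[:top]


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    markers = {"m1": [0, 0, 1, 1], "m2": [0, 1, 0, 1]}
    trait = [1.0, 1.2, 3.0, 3.1]
    print(rank_markers(markers, trait))
```

In this toy example the trait tracks the genotypes of `m1` closely, so `m1` ranks ahead of `m2`. In the procedure described in the abstract, only the top-ranked markers would then be passed on to sliced inverse regression.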
Simulating the Storage and the Blockage Effects of Buildings in Urban Flood Modeling
Buildings often affect overland flow propagation in urban areas. Building walls change the direction and velocity of flow and can exclude interior spaces from flooding. However, water may intrude into buildings when the flood level exceeds the height of protection. This study develops an inundation model that represents the resistance and storage effects of buildings. The model was applied to central Taipei City, which is surrounded by the Danshui and Keelung Rivers. Inundation depth and extent were compared between models in which the effects of buildings were included and excluded. Rainfall data from the Typhoon Nari event in 2001 were used in the simulation. The results showed that when the effects of buildings were excluded, inundation was underestimated in the metropolitan areas. When the effects of buildings were considered, the model reproduced inundation results more comparable with the observed flooding.
Ontology-based Fuzzy Markup Language Agent for Student and Robot Co-Learning
An intelligent robot agent based on domain ontology, machine learning
mechanism, and Fuzzy Markup Language (FML) for students and robot co-learning
is presented in this paper. The machine-human co-learning model is established
to help various students learn the mathematical concepts based on their
learning ability and performance. Meanwhile, the robot acts as a teacher's
assistant to co-learn with children in the class. The FML-based knowledge base
and rule base are embedded in the robot so that the teachers can get feedback
from the robot on whether students make progress or not. Next, we inferred
students' learning performance based on learning content's difficulty and
students' ability, concentration level, as well as teamwork spirit in the class.
Experimental results show that learning with the robot is helpful for
disadvantaged and below-basic children. Moreover, the accuracy of the
intelligent FML-based agent for student learning increases after the machine
learning mechanism is applied. Comment: This paper is submitted to IEEE WCCI 2018 Conference for revie
Inhibition of Serine Protease Activity Protects Against High Fat Diet-Induced Inflammation and Insulin Resistance.
Recent evidence suggests that enhanced protease-mediated inflammation may promote insulin resistance and result in diabetes. This study tested the hypothesis that serine protease plays a pivotal role in type 2 diabetes, and that inhibition of serine protease activity prevents hyperglycemia in diabetic animals by modulating the insulin signaling pathway. We conducted a single-center, cross-sectional study with 30 healthy controls and 57 patients with type 2 diabetes to compare plasma protease activities and inflammation markers between groups. Correlations of plasma total and serine protease activities with variables were calculated. In an in-vivo study, LDLR-/- mice were divided into normal chow diet, high-fat diet (HFD), and HFD with selective serine protease inhibition groups to examine differences in obesity, blood glucose level, insulin resistance, and serine protease activity among groups. Compared with controls, diabetic patients had significantly increased plasma total protease and serine protease activities, as well as elevated inflammatory cytokines. Plasma serine protease activity was positively correlated with body mass index, hemoglobin A1c, homeostasis model assessment-insulin resistance index (HOMA-IR), and tumor necrosis factor-α, and negatively correlated with adiponectin concentration. In the animal study, administration of HFD progressively increased body weight, fasting glucose level, and HOMA-IR, and upregulated serine protease activity. Furthermore, in-vivo serine protease inhibition significantly suppressed systemic inflammation, reduced fasting glucose level, and improved insulin resistance; these effects were probably mediated by modulating insulin receptor and cytokine expression in visceral adipose tissue. Our findings support the view that serine protease may play an important role in type 2 diabetes and suggest a rationale for a therapeutic strategy targeting serine protease for clinical prevention of type 2 diabetes.
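For readers unfamiliar with HOMA-IR, the index mentioned above is commonly computed from fasting glucose and fasting insulin. The sketch below uses the widely cited formula (glucose in mmol/L times insulin in µU/mL, divided by 22.5); the abstract does not state which variant the authors used, so this is an assumption, and the function name `homa_ir` is hypothetical.

```python
def homa_ir(fasting_glucose_mmol_l, fasting_insulin_uU_ml):
    """Homeostasis model assessment of insulin resistance (HOMA-IR).

    Assumes the common formula: glucose (mmol/L) * insulin (microU/mL) / 22.5.
    The study summarized above may have used a different variant or units.
    """
    if fasting_glucose_mmol_l < 0 or fasting_insulin_uU_ml < 0:
        raise ValueError("glucose and insulin must be non-negative")
    return fasting_glucose_mmol_l * fasting_insulin_uU_ml / 22.5


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    print(homa_ir(5.0, 9.0))
```

Higher values indicate greater insulin resistance; when glucose is reported in mg/dL, the same formula is usually written with a divisor of 405 instead of 22.5.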
Associations between blood glucose level and outcomes of adult in-hospital cardiac arrest: a retrospective cohort study
Additional file 3: Table S3. Features, interventions, and outcomes of cardiac arrest events stratified by the presence of measurement of blood glucose level after sustained return of spontaneous circulation